Results 1 - 20 of 573
2.
J Comput Assist Tomogr ; 47(2): 199-204, 2023.
Article in English | MEDLINE | ID: mdl-36790871

ABSTRACT

PURPOSE: Previous studies have shown that magnetic resonance- and fluorodeoxyglucose positron emission tomography-based radiomics have high predictive value for the response to neoadjuvant chemotherapy (NAC) in breast cancer by characterizing tumor heterogeneity in relaxation time and glucose metabolism, respectively. However, it is unclear whether computed tomography (CT)-based radiomics, which characterizes density heterogeneity, can predict the response to NAC. This study aimed to develop and validate a CT-based radiomics nomogram to predict the response to NAC in breast cancer. METHODS: A total of 162 breast cancer patients (110 in the training cohort and 52 in the validation cohort) who underwent CT scans before receiving NAC and had pathological response results were retrospectively enrolled. According to the Miller-Payne grading system, grade 4 to 5 cases were classified as response to NAC and grade 1 to 3 cases as nonresponse to NAC. Radiomics features were extracted, and the optimal radiomics features were selected to construct a radiomics signature. Multivariate logistic regression was used to develop the clinical prediction model and the radiomics nomogram, which incorporated clinical characteristics and the radiomics score. We assessed the performance of the models, including calibration and clinical usefulness. RESULTS: Eight optimal radiomics features were obtained. Human epidermal growth factor receptor 2 status and molecular subtype showed statistically significant differences between the response group and the nonresponse group. The radiomics nomogram had more favorable predictive efficacy than the clinical prediction model (areas under the curve, 0.82 vs 0.70 in the training cohort; 0.79 vs 0.71 in the validation cohort). The DeLong test showed a statistically significant difference between the clinical prediction model and the radiomics nomogram (z = 2.811, P = 0.005 in the training cohort). The decision curve analysis showed that the radiomics nomogram had higher overall net benefit than the clinical prediction model. CONCLUSION: The radiomics nomogram based on the CT radiomics signature and clinical characteristics has favorable predictive efficacy for the response to NAC in breast cancer.


Subject(s)
Breast Neoplasms , Computational Biology , Tomography, X-Ray Computed , Computational Biology/standards , Tomography, X-Ray Computed/standards , Neoadjuvant Therapy , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/drug therapy , Predictive Value of Tests , Retrospective Studies , Models, Statistical , Humans , Female , Adult , Middle Aged , Reproducibility of Results
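
The areas under the curve reported in the abstract above can be read as the probability that a randomly chosen responder receives a higher model score than a randomly chosen nonresponder. The following is a minimal sketch of that rank-based calculation, not the authors' analysis code; the function name `auc_from_scores` and the labels and scores are hypothetical and chosen only for illustration.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: fraction of (responder, nonresponder) pairs in which
    the responder has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = response to NAC, 0 = nonresponse.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = auc_from_scores(labels, scores)  # 8 of 9 pairs are ordered correctly
```

The quadratic pair loop is kept for readability; for cohorts of a few hundred patients it is fast enough, and a sorted-rank implementation would give the same value.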
4.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35189635

ABSTRACT

Protein lysine crotonylation (Kcr) is an important type of posttranslational modification that is associated with a wide range of biological processes. The identification of Kcr sites is critical to better understanding their functional mechanisms. However, the existing experimental techniques for detecting Kcr sites are not cost-effective, leading to a great need for new computational methods to address this problem. We here describe Adapt-Kcr, an advanced deep learning model that utilizes adaptive embedding and is based on a convolutional neural network together with a bidirectional long short-term memory network and attention architecture. On the independent testing set, Adapt-Kcr outperformed the current state-of-the-art Kcr prediction model, with an improvement of 3.2% in accuracy and 1.9% in the area under the receiver operating characteristic curve. Compared to other Kcr models, Adapt-Kcr additionally had a more robust ability to distinguish between crotonylation and other lysine modifications. Another model (Adapt-ST) was trained to predict phosphorylation sites in SARS-CoV-2, and outperformed the equivalent state-of-the-art phosphorylation site prediction model. These results indicate that self-adaptive embedding features perform better than handcrafted features in capturing discriminative information; when used with an attention architecture, this could be an effective way of identifying protein Kcr sites. Together, our Adapt framework (including learning embedding features and attention architecture) has a strong potential for prediction of other protein posttranslational modification sites.


Subject(s)
Computational Biology , Deep Learning , Lysine/metabolism , Protein Processing, Post-Translational , Software , Algorithms , Benchmarking , Computational Biology/methods , Computational Biology/standards , Databases, Factual , Neural Networks, Computer , Phosphorylation , ROC Curve , Reproducibility of Results , User-Computer Interface
5.
J Parasitol ; 108(1): 79-87, 2022 01 01.
Article in English | MEDLINE | ID: mdl-35171246

ABSTRACT

Echinococcosis is a zoonotic disease with great significance to public health, and appropriate detection and control strategies should be adopted to mitigate its impact. Most cases of echinococcosis are believed to be transmitted by the consumption of food and/or water contaminated with canid stool containing Echinococcus spp. eggs. Studies assessing Echinococcus multilocularis, Echinococcus granulosus sensu stricto, and Echinococcus shiquicus coinfection from contaminated water-derived, soil-derived, and food-borne samples are scarce, which may be due to the lack of optimized laboratory detection methods. The present study aimed to develop and evaluate a novel triplex TaqMan-minor groove binder probe for real-time polymerase chain reaction (rtPCR) to simultaneously detect the 3 Echinococcus spp. mentioned above from canid fecal samples in the Qinghai-Tibetan Plateau area (QTPA). The efficiency and linearity of each signal channel in the triplex rtPCR assay were within acceptable limits for the range of concentrations tested. Furthermore, the method was shown to have good repeatability (standard deviation ≤0.32 cycle threshold), and the limit of detection was estimated to be 10 copies plasmid/µl reaction. In summary, the evaluation of the present method shows that the newly developed triplex rtPCR assay is a highly specific, precise, consistent, and stable method that could be used in epidemiological investigations of echinococcosis.


Subject(s)
Canidae/parasitology , Dog Diseases/parasitology , Echinococcosis/veterinary , Echinococcus/isolation & purification , Feces/parasitology , Multiplex Polymerase Chain Reaction/veterinary , Animals , Computational Biology/standards , DNA, Helminth/isolation & purification , Dogs , Echinococcosis/parasitology , Echinococcus/classification , Echinococcus/genetics , Foxes/parasitology , Limit of Detection , Multiplex Polymerase Chain Reaction/methods , Multiplex Polymerase Chain Reaction/standards , Reproducibility of Results , Sensitivity and Specificity , Soil/parasitology
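
The abstract above reports that the efficiency and linearity of each signal channel were within acceptable limits. A common way to obtain both is to fit a standard curve of cycle threshold (Ct) against log10 of the template copy number and convert the slope to an amplification efficiency with E = 10^(-1/slope) - 1. The sketch below illustrates that general calculation only; the function name `fit_standard_curve` and the dilution series are hypothetical, and the abstract does not state how the authors computed these values.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct against log10(copies).

    Returns (slope, intercept, r_squared, efficiency), where
    efficiency = 10 ** (-1 / slope) - 1 (1.0 means perfect doubling per cycle).
    """
    x = [math.log10(c) for c in copies]
    y = list(ct_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # linearity of the curve
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


# Hypothetical 10-fold dilution series for one channel (copies per reaction).
copies = [1e1, 1e2, 1e3, 1e4, 1e5]
ct = [35.00, 31.68, 28.36, 25.04, 21.72]
slope, intercept, r2, eff = fit_standard_curve(copies, ct)
```

A slope near -3.32 corresponds to an efficiency close to 100%; each channel of a multiplex assay would be fitted separately.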
6.
Viruses ; 14(2)2022 01 19.
Article in English | MEDLINE | ID: mdl-35215779

ABSTRACT

Whole-genome sequencing of viral isolates is critical for understanding transmission patterns and the ongoing evolution of pathogens, especially during a pandemic. However, when genomes have low variability in the early stages of a pandemic, the impact of technical and/or sequencing errors increases. We quantitatively assessed inter-laboratory differences in consensus genome assemblies of 72 matched SARS-CoV-2-positive specimens sequenced at different laboratories in Sydney, Australia. Raw sequence data were assembled using two different bioinformatics pipelines in parallel, and resulting consensus genomes were compared to detect laboratory-specific differences. Matched genome sequences were predominantly concordant, with a median pairwise identity of 99.997%. Identified differences were predominantly driven by ambiguous site content. When these sites were ignored, differences remained in only 2.3% (5/216) of pairwise comparisons, each differing by a single nucleotide. Matched samples were assigned the same Pango lineage in 98.2% (212/216) of pairwise comparisons, and were mostly assigned to the same phylogenetic clade. However, epidemiological inference based only on single nucleotide variant distances may lead to significant differences in the number of defined clusters if variant allele frequency thresholds for consensus genome generation differ between laboratories. These results underscore the need for a unified, best-practices approach to bioinformatics between laboratories working on a common outbreak problem.


Subject(s)
Computational Biology/standards , Consensus , Genome, Viral , Laboratories/standards , Public Health , SARS-CoV-2/genetics , Australia , Computational Biology/methods , Humans , Phylogeny , SARS-CoV-2/classification , Whole Genome Sequencing
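
The abstract above compares matched consensus genomes by pairwise identity, with and without ambiguous sites. A minimal sketch of such a comparison follows, assuming the two consensus sequences are already aligned to the same length; the function name `pairwise_identity`, the choice to treat every non-ACGT character (including `N`, IUPAC codes and gaps) as ambiguous, and the toy sequences are assumptions for illustration, not the study's pipeline.

```python
UNAMBIGUOUS = set("ACGT")


def pairwise_identity(seq_a, seq_b, ignore_ambiguous=True):
    """Return (percent_identity, mismatches) for two aligned sequences.

    With ignore_ambiguous=True, any column in which either sequence has a
    character outside A/C/G/T is skipped before counting.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if ignore_ambiguous and (a not in UNAMBIGUOUS or b not in UNAMBIGUOUS):
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * (compared - mismatches) / compared, mismatches


# Toy example: the two sequences differ only at ambiguous sites (N and R).
lab_1 = "ACGTNACGTA"
lab_2 = "ACGTAACGRA"
strict = pairwise_identity(lab_1, lab_2, ignore_ambiguous=False)
masked = pairwise_identity(lab_1, lab_2, ignore_ambiguous=True)
```

In this toy pair the sequences look 80% identical when ambiguous sites are counted and fully identical once they are masked, which mirrors the abstract's observation that most inter-laboratory differences came from ambiguous site content.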
7.
PLoS One ; 17(1): e0262615, 2022.
Article in English | MEDLINE | ID: mdl-35041695

ABSTRACT

Although several studies have been conducted to summarize the progress of open educational resources (OER) in specific regions, only a limited number of studies summarize OER in Africa. Therefore, this paper presents a systematic literature review to explore trends, themes, and patterns in this emerging area of study, using content and bibliometric analysis. Findings indicated three major strands of OER research in Africa: (1) OER adoption is limited to specific African countries, calling for more research and collaboration between African countries in this field to ensure educational equity; (2) most of the OER initiatives in Africa have focused on the creation process and neglected other important perspectives, such as dissemination and open educational practices (OEP) using OER; and (3) on top of the typical challenges for OER adoption (e.g., infrastructure), other personal challenges were identified within the African context, including culture, language, and personality. The findings of this study suggest that more initiatives and cross-collaborations with African and non-African countries in the field of OER are needed to facilitate OER adoption in the region. Additionally, it is suggested that researchers and practitioners should consider individual differences, such as language, personality and culture, when promoting and designing OER for different African countries. Finally, the findings can promote social justice by providing insights and future research paths that different stakeholders (e.g., policy makers, educators, practitioners) should focus on to promote OER in Africa.


Subject(s)
Biological Science Disciplines/education , Computational Biology/standards , Education, Distance/standards , Research Personnel/education , Africa , Bibliometrics , Humans , Research Personnel/statistics & numerical data
8.
Nucleic Acids Res ; 50(D1): D1515-D1521, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34986598

ABSTRACT

The Evidence and Conclusion Ontology (ECO) is a community resource that provides an ontology of terms used to capture the type of evidence that supports biomedical annotations and assertions. Consistent capture of evidence information with ECO allows tracking of annotation provenance, establishment of quality control measures, and evidence-based data mining. ECO is in use by dozens of data repositories and resources with both specific and general areas of focus. ECO is continually being expanded and enhanced in response to user requests as well as our aim to adhere to community best-practices for ontology development. The ECO support team engages in multiple collaborations with other ontologies and annotating groups. Here we report on recent updates to the ECO ontology itself as well as associated resources that are available through this project. ECO project products are freely available for download from the project website (https://evidenceontology.org/) and GitHub (https://github.com/evidenceontology/evidenceontology). ECO is released into the public domain under a CC0 1.0 Universal license.


Subject(s)
Computational Biology/standards , Databases, Genetic , Gene Ontology , Software , Humans , Molecular Sequence Annotation
9.
Nucleic Acids Res ; 50(2): e7, 2022 01 25.
Article in English | MEDLINE | ID: mdl-34648021

ABSTRACT

Single-cell RNA sequencing has become a powerful tool for identifying and characterizing cellular heterogeneity. One essential step to understanding cellular heterogeneity is determining cell identities. The widely used strategy predicts identities by projecting cells or cell clusters unidirectionally against a reference to find the best match. Here, we develop a bidirectional method, scMRMA, where a hierarchical reference guides iterative clustering and deep annotation with enhanced resolutions. Taking full advantage of the reference, scMRMA greatly improves the annotation accuracy. scMRMA achieved better performance than existing methods in four benchmark datasets and successfully revealed the expansion of CD8 T cell populations in squamous cell carcinoma after anti-PD-1 treatment.


Subject(s)
Biomarkers , Computational Biology/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis , Software , Algorithms , Cluster Analysis , Computational Biology/standards , Databases, Genetic , Gene Expression Profiling/standards , Humans , Molecular Sequence Annotation , Reproducibility of Results , Sequence Analysis, RNA/standards , Single-Cell Analysis/methods
10.
Nucleic Acids Res ; 50(D1): D1475-D1482, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34554254

ABSTRACT

Nearly 200 plant genomes have been sequenced over the last two years, and new functions of plant microRNAs (miRNAs) have been revealed. Therefore, timely updates of plant miRNA databases that incorporate miRNAs from the newly sequenced species and functional information are required to provide useful resources for advancing plant miRNA research. Here we report the update of PmiREN2.0 (https://pmiren.com/) with an addition of 19 363 miRNA entries from 91 plants, doubling the amount of data in the original version. Meanwhile, abundant regulatory information centred on miRNAs was added, including upstream transcription factors predicted through binding motif scanning and elaborate annotation of miRNA targets. As an example, a genome-wide regulatory network centred on miRNAs was constructed for Arabidopsis. Furthermore, phylogenetic trees of conserved miRNA families were built to expand the understanding of miRNA evolution across the plant lineages. These data are helpful to deduce the regulatory relationships concerning miRNA functions in diverse plants. Besides the new data, a suite of design tools was incorporated to facilitate experimental practice. Finally, a forum named 'PmiREN Community' was added for discussion and for sharing resources and new discoveries. With these upgrades, PmiREN2.0 should serve the community better and accelerate miRNA research in plants.


Subject(s)
Databases, Genetic , MicroRNAs/genetics , Plants/genetics , Software , Computational Biology/standards , Gene Expression Regulation, Plant/genetics , Genome, Plant/genetics , MicroRNAs/classification
11.
Mol Genet Genomics ; 297(1): 33-46, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34755217

ABSTRACT

Based on molecular markers, genomic prediction enables us to speed up breeding schemes and increase the response to selection. There are several high-throughput genotyping platforms able to deliver thousands of molecular markers for genomic study purposes. However, even though it is widely applied in plant breeding, species without a reference genome cannot fully benefit from genomic tools and modern breeding schemes. We used a method to assemble a population-tailored mock genome to call single-nucleotide polymorphism (SNP) markers without an available reference genome, and for the first time, we compared the results with standard genotyping platforms (array and genotyping-by-sequencing (GBS) using a reference genome) for performance in genomic prediction models. Our results indicate that using a population-tailored mock genome to call SNPs delivers reliable estimates for the genomic relationship between genotypes. Furthermore, genomic prediction estimates were comparable to standard approaches, especially when considering only additive effects. However, mock genomes were slightly worse than arrays at predicting traits influenced by dominance effects, but still performed as well as standard GBS methods that use a reference genome. Nevertheless, the array-based SNP marker methods achieved the best predictive ability and reliability to estimate variance components. Overall, the mock genomes can be a worthy alternative for genomic selection studies, especially for those species where the reference genome is not available.


Subject(s)
Computational Biology , Genotyping Techniques , Models, Genetic , Animals , Chimera/genetics , Computational Biology/methods , Computational Biology/standards , Datasets as Topic , Genome , Genome-Wide Association Study/methods , Genome-Wide Association Study/standards , Genomics/methods , Genomics/standards , Genotype , Genotyping Techniques/methods , Genotyping Techniques/standards , Phenotype , Reference Standards , Reproducibility of Results , Selection, Genetic , Species Specificity , Zea mays/classification , Zea mays/genetics
12.
Nucleic Acids Res ; 50(D1): D1522-D1527, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34871441

ABSTRACT

The rapid development of proteomics studies has resulted in large volumes of experimental data. The emergence of big data platforms provides the opportunity to handle these large amounts of data. The integrated proteome resource, iProX (https://www.iprox.cn), which was initiated in 2017, has been greatly improved with an up-to-date big data platform implemented in 2021. Here, we describe the main iProX developments since its first publication in Nucleic Acids Research in 2019. First, a hyper-converged architecture with high scalability supports the submission process. A Hadoop cluster can store large amounts of proteomics datasets, and a distributed, RESTful Elasticsearch engine can query millions of records within one second. Also, several new features, including the Universal Spectrum Identifier (USI) mechanism proposed by ProteomeXchange, a RESTful Web Service API, and a high-efficiency reanalysis pipeline, have been added to iProX for better open data sharing. By the end of August 2021, 1526 datasets had been submitted to iProX, reaching a total data volume of 92.42 TB. With the implementation of the big data platform, iProX can support PB-level data storage, hundreds of billions of spectra records, and second-level latency service capabilities that meet the requirements of the fast-growing field of proteomics.


Subject(s)
Databases, Protein , Proteome/genetics , Proteomics , Software , Big Data , Computational Biology/standards , Information Dissemination
13.
Sci Rep ; 11(1): 23747, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34887492

ABSTRACT

Among the many types of genetic variation, missense variants are a major class, and a small subset of them may disrupt protein function and ultimately lead to human disease. Various machine learning methods have been proposed to distinguish deleterious from benign missense variants using a large number of features, including structure, sequence, interaction networks, gene-disease associations, and phenotypes. However, a reliable and accurate algorithm for merging heterogeneous information is still needed, as it could capture the complex interactions in the networks in which genes participate. In this study, we propose a new method based on non-negative matrix tri-factorization clustering. We outline two versions of the proposed method: a two-source and a three-source algorithm. The two-source algorithm aggregates individual deleteriousness prediction methods and the PPI network, and the three-source algorithm additionally incorporates gene-disease associations. Four benchmark datasets were used for internal and external validation of both algorithms. The results on all datasets confirmed that our method outperforms most state-of-the-art variant prediction tools. Two key features of our variant effect prediction method are worth mentioning. First, although incorporating gene-disease information in the three-source algorithm can improve prediction performance compared with the two-source algorithm, our method is not hindered by type 2 circularity error, unlike some recent ensemble-based prediction methods. Type 2 circularity error occurs when the predictor annotates variants on the basis of the genes in which they are located. Second, the performance of our predictor is superior to that of other ensemble-based methods for variants located in genes for which there is not enough information about pathogenicity.


Subject(s)
Computational Biology/methods , Genetic Association Studies , Mutation, Missense , Supervised Machine Learning , Algorithms , Computational Biology/standards , Humans , ROC Curve , Reproducibility of Results , Systems Biology/methods
14.
Eur Rev Med Pharmacol Sci ; 25(1 Suppl): 1-6, 2021 12.
Article in English | MEDLINE | ID: mdl-34890028

ABSTRACT

OBJECTIVE: While the bioinformatic workflow, from quality control to annotation, is quite standardized, the interpretation of variants is still a challenge. The decreasing cost of massively parallel NGS has produced hundreds of variants per patient to analyze and interpret. The ACMG "Standards and guidelines for the interpretation of sequence variants", widely adopted in clinical settings, assume that the clinician has a comprehensive knowledge of the literature and the disease. MATERIALS AND METHODS: To semi-automate the application of the guidelines, we decided to develop an algorithm that exploits VarSome, a widely used platform that interprets variants on the basis of information from more than 70 genome databases. RESULTS: Here we explain how we integrated the VarSome API into our existing clinical diagnostic pipeline for NGS data to obtain validated, reproducible results as indicated by accuracy, sensitivity and specificity. CONCLUSIONS: We validated the automated pipeline to confirm that it behaved as expected. We obtained 100% sensitivity, specificity and accuracy, confirming that it was suitable for use in a diagnostic setting.


Subject(s)
Algorithms , Genetic Variation/genetics , Genomics/standards , High-Throughput Nucleotide Sequencing/standards , Practice Guidelines as Topic/standards , Search Engine/standards , Computational Biology/methods , Computational Biology/standards , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Search Engine/methods , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards
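
The validation in the abstract above is summarized by sensitivity, specificity and accuracy. These follow directly from a confusion matrix of pipeline calls against reference classifications, as in the sketch below. The function name `diagnostic_metrics` and the example calls are hypothetical; the abstract does not describe the validation set, and no VarSome API call is shown here.

```python
def diagnostic_metrics(truth, predicted):
    """Sensitivity, specificity and accuracy for paired boolean calls.

    truth[i] is the reference classification (True = pathogenic) and
    predicted[i] is the automated pipeline's call for the same variant.
    """
    if len(truth) != len(predicted) or not truth:
        raise ValueError("truth and predicted must be non-empty and equal in length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference call")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


# Hypothetical validation set in which one pathogenic variant is missed.
truth = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False]
metrics = diagnostic_metrics(truth, predicted)
```

A result of 100% on all three metrics, as the abstract reports, corresponds to zero false positives and zero false negatives on the validation set.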
15.
Genes (Basel) ; 12(12)2021 11 25.
Article in English | MEDLINE | ID: mdl-34946832

ABSTRACT

Variant interpretation is challenging as it involves combining different levels of evidence in order to evaluate the role of a specific variant in the context of a patient's disease. Many in-depth refinements followed the original 2015 American College of Medical Genetics (ACMG) guidelines to overcome subjective interpretation of criteria and classification inconsistencies. Here, we developed an ACMG-based classifier that retrieves information for variant interpretation from the VarSome Stable-API environment and allows molecular geneticists involved in clinical reporting to introduce the necessary changes to criterion strength and to add or exclude criteria assigned automatically, ultimately leading to the final variant classification. We also developed a modified ACMG checklist to assist molecular geneticists in adjusting criterion strength and in adding literature-retrieved or patient-specific information, when available. The proposed classifier is an example of integration of automation and human expertise in variant curation, while maintaining the laboratory analytical workflow and the established bioinformatics pipeline.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics/standards , Computational Biology/standards , Genetic Testing/standards , Humans
16.
Genes (Basel) ; 12(11)2021 11 18.
Article in English | MEDLINE | ID: mdl-34828413

ABSTRACT

Inherited bleeding disorders (IBDs) are the most frequent congenital diseases in the Colombian population; three of them are hemophilia A (HA), hemophilia B (HB), and von Willebrand Disease (VWD). Currently, diagnosis relies on multiple clinical laboratory assays to assign a phenotype. Due to the lack of accessibility to these tests, patients can receive an incomplete diagnosis. In these cases, genetic studies reinforce the clinical diagnosis. The present study characterized the molecular genetic basis of 11 HA, three HB, and five VWD patients by sequencing the F8, F9, or the VWF gene. Twelve variations were found in HA patients, four in HB patients, and 19 in VWD patients. Of these variations, 25 were novel. Disease-causing variations were used as positive controls for validation of the high-resolution melting (HRM) variant-scanning technique. This approach is a low-cost genetic diagnostic method proposed to be incorporated in developing countries. For the data analysis, we developed an accessible open-source code in Python that improves HRM data analysis, with a sensitivity of 95% and without bias when using different HRM equipment and software. Analysis of amplicons with a length greater than 300 bp can be performed by implementing an analysis by denaturation domains.


Subject(s)
Blood Coagulation Disorders, Inherited/diagnosis , Computational Biology/methods , Factor IX/genetics , Genetic Testing/methods , Hemophilia A/genetics , von Willebrand Factor/genetics , Blood Coagulation Disorders, Inherited/genetics , Colombia , Computational Biology/economics , Computational Biology/standards , Costs and Cost Analysis , Factor IX/chemistry , Genetic Testing/economics , Genetic Testing/standards , Hemophilia A/diagnosis , Humans , Protein Domains , Sensitivity and Specificity , von Willebrand Factor/chemistry
17.
Nat Methods ; 18(12): 1496-1498, 2021 12.
Article in English | MEDLINE | ID: mdl-34845388

ABSTRACT

The rapid pace of innovation in biological imaging and the diversity of its applications have prevented the establishment of a community-agreed standardized data format. We propose that complementing established open formats such as OME-TIFF and HDF5 with a next-generation file format such as Zarr will satisfy the majority of use cases in bioimaging. Critically, a common metadata format used in all these vessels can deliver truly findable, accessible, interoperable and reusable bioimaging data.


Subject(s)
Computational Biology/instrumentation , Computational Biology/standards , Metadata , Microscopy/instrumentation , Microscopy/standards , Software , Benchmarking , Computational Biology/methods , Data Compression , Databases, Factual , Information Storage and Retrieval , Internet , Microscopy/methods , Programming Languages , SARS-CoV-2
20.
Am J Hum Genet ; 108(10): 1891-1906, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34551312

ABSTRACT

The success of personalized genomic medicine depends on our ability to assess the pathogenicity of rare human variants, including the important class of missense variation. There are many challenges in training accurate computational systems, e.g., in finding the balance between quantity, quality, and bias in the variant sets used as training examples and avoiding predictive features that can accentuate the effects of bias. Here, we describe VARITY, which judiciously exploits a larger reservoir of training examples with uncertain accuracy and representativity. To limit circularity and bias, VARITY excludes features informed by variant annotation and protein identity. To provide a rationale for each prediction, we quantified the contribution of features and feature combinations to the pathogenicity inference of each variant. VARITY outperformed all previous computational methods evaluated, identifying at least 10% more pathogenic variants at thresholds achieving high (90% precision) stringency.


Subject(s)
Algorithms , Computational Biology/standards , Disease/etiology , Mutation, Missense , Genetic Predisposition to Disease , Humans , Phenotype , Precision Medicine , Software